me. When this happens, gene expressions across replicates will
have a heterogeneous distribution and only a small proportion of
expressions show an inconsistent trend. For instance, most replicates show
high expressions, but a few show very low expressions. A heterogeneous
expression distribution often happens in disease-related experiments such
as trials, drug resistance and cancer diagnosis research. Evidence can
be found in the literature regarding the heterogeneous pattern of gene
expression profiles [Miyachi, et al., 1993; Wani, et al., 1993; Hess, et al.,
zat, et al., 1995; Suzuki, et al., 1998; Knaust, et al., 2000;
ma, et al., 2000; Ebina, et al., 2001; Makhijani, et al., 2018;
ya, et al., 2020]. A differentially expressed gene with some outlier
expression(s) present is called a heterogeneous differentially expressed
gene, and the expressions of such a gene are called heterogeneous
expressions.
A consequence of heterogeneous expressions is the potential
inflation of the Type I error rate or the Type II error rate. Because of the
heterogeneous expressions, the discovery of DEGs is challenged when
using common methods such as the t test or the modified t test. In
addition, the conventional outlier test approaches [Dixon, 1950; Grubbs,
xon, 1951] may not be efficient. An efficient way is to embed an
outlier detection component into the DEG discovery process for a robust
DEG discovery.
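The effect described above can be illustrated with a minimal sketch. It assumes a pooled-variance two-sample t statistic and uses made-up expression values for a single gene; the helper name student_t, the sample values and the group sizes are all hypothetical and are not taken from any data set discussed in the text. The sketch only shows how one outlying replicate can pull the t statistic below the significance threshold, which is one way the Type II error rate can rise.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Hypothetical log-scale expressions of one gene (illustrative values only).
normal = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0]
cancer_homogeneous = [8.0, 8.2, 7.9, 8.1, 8.0, 7.8]
# Same gene, but one cancer replicate shows a very low expression (an outlier).
cancer_heterogeneous = [8.0, 8.2, 7.9, 8.1, 8.0, 1.0]

# Two-sided 5% critical value of the t distribution with 6 + 6 - 2 = 10
# degrees of freedom.
T_CRIT = 2.228

t_clean = student_t(cancer_homogeneous, normal)
t_outlier = student_t(cancer_heterogeneous, normal)

print(f"t without outlier:  {t_clean:.2f} (significant: {abs(t_clean) > T_CRIT})")
print(f"t with one outlier: {t_outlier:.2f} (significant: {abs(t_outlier) > T_CRIT})")
```

In this toy case the single low replicate both lowers the cancer group mean and greatly enlarges the pooled variance, so a gene that is clearly differentially expressed in most replicates is no longer detected by the t test.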
An example of heterogeneous gene expression
9 is a data set used for breast cancer diagnosis and it is composed
rmal tumour samples and 14 cancer samples [Tripathi, et al.,
is interesting to know how heterogeneous gene expression
affects the DEG discovery for this data set. First, a matrix of the normal
replicates and a matrix of the cancer replicates of the data were extracted
from the whole data spreadsheet. For each gene, a p value was obtained
using a t test between the normal replicates and the cancer replicates.
This p value was called a raw p value. Afterwards, whether the cancer